Competitive Evaluation of Commercially Available Speech Recognizers in Multiple Languages
نویسندگان
چکیده
Recent improvements in speech recognition technology have resulted in products that can now demonstrate commercial value in a variety of applications. Many vendors are marketing products which combine ASR applications including continuous dictation, command-and-control interfaces, and transcription of recorded speech at an accuracy of 98%. In this study, we measured the accuracy of certain commercially available desktop speech recognition engines in multiple languages. Using word error rate as a benchmark, this work compares recognition accuracy across eight languages and the products of three manufacturers. Results show that two systems performed almost the same while a third system recognized at lower accuracy, although none of the systems reached the claimed accuracy. Read speech was recognized better than spontaneous speech. The systems for US-English, Japanese and Spanish showed higher accuracy than the systems for UK-English, German, French and Chinese.
منابع مشابه
Practical Evaluation of Speech Recognizers for Virtual Human Dialogue Systems
We perform a large-scale evaluation of multiple off-the-shelf speech recognizers across diverse domains for virtual human dialogue systems. Our evaluation is aimed at speech recognition consumers and potential consumers with limited experience with readily available recognizers. We focus on practical factors to determine what levels of performance can be expected from different available recogn...
متن کاملBoosting under-resourced speech recognizers by exploiting out-of-language data - case study on Afrikaans
Under-resourced speech recognizers may benefit from data in languages other than the target language. In this paper, we boost the performance of an Afrikaans speech recognizer by using already available data from other languages. To successfully exploit available multilingual resources, we use posterior features, estimated by multilayer perceptrons that are trained on similar languages. For two...
متن کاملAutomatic Generation of Pronunciation Dictionaries
In this report we will describe a data driven approach for creating pronunciation dictionaries for a new unseen target language by voting among phoneme recognizers in nine different languages other than the target language. In this process recordings of the new language that are transcribed on word level are decoded by the phoneme recognizers. This results in a hypothesis of nine phonemes per t...
متن کاملDevelopment of a Cantonese-English code-mixing speech corpus
This paper describes the design and compilation of the CUMIX Cantonese-English code-mixing speech corpus. Code-mixing is a common phenomenon in many bilingual societies and it usually involves at least two different languages within one utterance. In Hong Kong, people usually mix English words and phrases with Cantonese in their daily conversation. Although there are many monolingual corpora of...
متن کاملEuropean Organisation for the Safety of Air Navigation Eurocontrol Experimental Centre
This report discusses the use of situation knowledge as a means to enhance the recognition performance of commercially available automatic speech recognizers. A cognitive model of the ATC controller is proposed that continuously observes the present situation and generates a prediction of the sentences the controller is most likely to say. The prediction is made available to the speech recogniz...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2006